Speech Reception in Young Children with Autism Is Selectively Indexed by a Neural Oscillation Coupling Anomaly

Communication difficulties are one of the core criteria in diagnosing autism spectrum disorder (ASD), and are often characterized by speech reception difficulties, whose biological underpinnings are not yet identified. This deficit could denote atypical neuronal ensemble activity, as reflected by neural oscillations. Atypical cross-frequency oscillation coupling, in particular, could disrupt the joint tracking and prediction of dynamic acoustic stimuli, a dual process that is essential for speech comprehension. Whether such oscillatory anomalies already exist in very young children with ASD, and with what specificity they relate to individual language reception capacity, is unknown. We collected neural activity data using electroencephalography (EEG) in 64 very young children with and without ASD (mean age 3; 17 females, 47 males) while they were exposed to naturalistic-continuous speech. EEG power in frequency bands typically associated with phrase-level chunking (δ, 1–3 Hz), phonemic encoding (low-γ, 25–35 Hz), and top-down control (β, 12–20 Hz) was markedly reduced in ASD relative to typically developing (TD) children. Speech neural tracking by δ and θ (4–8 Hz) oscillations was also weaker in ASD compared with TD children. After controlling for gaze-pattern differences, we found that the classical θ/γ coupling was replaced by an atypical β/γ coupling in children with ASD. This anomaly was the single most specific predictor of individual speech reception difficulties in ASD children. These findings suggest that early interventions (e.g., neurostimulation) targeting the disruption of β/γ coupling and the upregulation of θ/γ coupling could improve speech processing coordination in young children with ASD and help them engage in oral interactions.

Significance Statement
Very young children already present marked alterations of neural oscillatory activity in response to natural speech at the time of autism spectrum disorder (ASD) diagnosis. Hierarchical processing of phonemic-range and syllabic-range information (θ/γ coupling) is disrupted in ASD children. Abnormal bottom-up (low-γ) and top-down (low-β) coordination specifically predicts speech reception deficits in very young ASD children, and no other cognitive deficit.


Introduction
Autism spectrum disorder (ASD) primarily refers to disorders of social interactions; however, individuals with ASD are seldom spared from language difficulties. Many people with ASD have severe language impairment, and even high-functioning ASD individuals with excellent language skills have difficulty understanding speech in noisy environments and/or when exposed to multiple speakers (Beker et al., 2018; Schelinski and von Kriegstein, 2020). Individuals with ASD occasionally report that during childhood, speech was completely or partially unintelligible (Boucher, 2012; Kasari et al., 2014) and that speech reception was especially laborious if it contained many consonants, resulting in speech that sounded like a sequence of vowels (McKeever et al., 2019). Furthermore, even when they have an excellent language level, individuals with ASD often exhibit atypical elocution (Nadig and Shaw, 2012), which presumably further denotes speech reception anomalies (Paul et al., 2005).
A large stream of recent studies shows that neural oscillations, i.e., the synchronous activity of neuronal populations, play a critical role in speech reception, mainly in parsing the speech flow into meaningful linguistic units: phonemes, syllables, words, phrases, etc. (Poeppel, 2003). While the 25- to 35-Hz low-γ oscillation rhythm is argued to work as a basic speech sampling rhythm enabling the encoding of phonemic-level acoustic features (Lehongre et al., 2011; Hyafil et al., 2015a; Hovsepyan et al., 2020), the 4- to 7-Hz θ rhythm (Giraud and Poeppel, 2012; Ghitza, 2013; Mai et al., 2016; Giraud, 2020) can flexibly track syllable boundaries, hence playing a key role in speech intelligibility (Hyafil et al., 2015a; Pefkou et al., 2017). On the other hand, the slower 1- to 3-Hz δ rhythm has an endogenous role in prosodic processing (Ghitza, 2017; Teoh et al., 2019) and phrasal chunking (Ding et al., 2014), while the 12- to 20-Hz low-β rhythm conveys top-down information, presumably at an intermediate timescale between phonemes and syllables (Pefkou et al., 2017; Giraud, 2020). Working in combination via their hierarchical coupling (the lower-frequency rhythm modulates the amplitude of higher-frequency activity), these different families of neural oscillations can underpin specific cognitive operations (Canolty and Knight, 2010; Hyafil et al., 2015b). In particular, θ/γ phase-amplitude coupling (PAC; Gross et al., 2013; Hyafil et al., 2015a,b; Lizarazu et al., 2019) enables the hierarchical encoding of the phonemic structure within syllables, while β/γ PAC is associated with specific operations in predicting and planning speech (Chao et al., 2018).
In typically developing (TD) children, the ability to track the speech temporal structure develops very early (Leong et al., 2017; Jessen et al., 2019; Attaheri et al., 2022; Cabrera and Gervain, 2020; Ortiz Barajas et al., 2021). The infant auditory system can already track the syllabic rhythm in nursery rhymes (Leong et al., 2017; Attaheri et al., 2022) and native language sentences (Ortiz Barajas et al., 2021). Even 1- to 8-day-old newborns can detect a consonant change on the sole basis of speech envelope cues (Cabrera and Gervain, 2020). Importantly, oscillation cross-frequency coupling (δ-θ/low-γ), which orchestrates the encoding of phonemic structure within syllables (Hyafil et al., 2015a), is found in infants younger than 11 months (Attaheri et al., 2022). Altogether, these findings suggest that the cortical tracking of speech rhythms is present at birth and likely contributes to phonological learning (Goswami, 2019). Its early disruption could be a possible cause of speech decoding impairment in children with ASD (Giraud and Poeppel, 2012; Jochaut et al., 2015; Menn et al., 2022), and would participate in the altered development of their social skills.
Oscillation anomalies in response to speech have previously been reported in ASD, such as decreased γ responses to rapid spectrotemporal transitions associated with diphones (Rojas et al., 2008; Gandal et al., 2010), increased θ responses to repeated tones and syllables (Wang et al., 2017; Zhang et al., 2018), and reduced β responses to novel sounds (Mamashli et al., 2017). Young adults with ASD (Jochaut et al., 2015) show joint anomalies of neural θ (syllabic-level) and low-γ (phonemic-level) activity, originating predominantly from the auditory cortex. In this brain region, the interplay between θ and γ neural activity was shown to be abnormal, suggesting that phonemic encoding was not temporally aligned with syllable tracking (Jochaut et al., 2015), a deficit that could deeply disrupt on-line speech reception (Hovsepyan et al., 2020). Although the observed θ/γ anomaly tightly correlated with the verbal scores of ASD participants (Jochaut et al., 2015), conclusions cannot be drawn on whether these oscillation coupling anomalies are specifically associated with the speech reception deficit. Furthermore, it is unknown whether these anomalies are present in very young children, when autistic symptoms usually become more evident, particularly during the developmental stage in which typically developing children rapidly expand their speech repertoire (Press, 2015).
In the present study, we used high-density electroencephalography (EEG) to compare oscillatory neural processing of age-appropriate naturalistic speech in 64 very young children (1.31-5.56 years old) with and without ASD. Given the major developmental changes occurring in large-scale brain networks around this age in ASD (Nomi and Uddin, 2015), exploring speech reception as soon as the diagnosis is established is critical. If auditory cortical oscillations and their cross-frequency interactions are involved in language acquisition, we expect children diagnosed with ASD to already exhibit oscillation anomalies, notably in the δ-θ, low-β, and low-γ bands (Fujioka et al., 2012; Meyer et al., 2017; Pefkou et al., 2017; Molinaro and Lizarazu, 2018). We further expect these anomalies to specifically predict the language reception status (and not other developmental traits). The precise characterization of oscillation anomalies that are expected to be causally involved in speech reception difficulties is an indispensable first step toward targeted interventions aiming at normalizing (boosting or downregulating) oscillatory activity in ASD (Kayarian et al., 2020; Marchesotti et al., 2020).

Inclusion procedure
Sixty-four children (mean age: 3.00 ± 1.12, 17 females) with and without ASD drawn from the Geneva Autism Cohort (Franchini et al., 2018; Latrèche et al., 2021; Solazzo et al., 2021) underwent a large battery of tests, including EEG recordings while they watched movies (French cartoon aimed at young children: Trotro; Lezoray, 2013a,b,c,d). All children were recruited via specialized clinical centers or announcements in the community. The study was approved by the Ethics Committee of the Faculty of Medicine of the University of Geneva Hospital in accordance with the Declaration of Helsinki. A phone interview and a medical developmental history questionnaire were completed before the initial visit of each participant; for the typically developing children, the following were considered exclusion criteria: any suspicion of altered development, a history of neurologic or psychological disorder, and a family history of ASD in first-degree relatives. Parents gave their informed consent before inclusion in the study. The dataset selected for this study met the following criteria: the data were recorded during the participant's first visit, the participant's age was below six years, the markers associated with movie onset were clear and accurate, there were exploitable raw data for four movies, and the participant watched the screen during all the recordings. Given these strict inclusion criteria, we selected 64 children with and without ASD from the participants in the cohort.
For all ASD children, the clinical diagnosis was formally corroborated using the Autism Diagnostic Observation Schedule-Generic (ADOS-G; Lord et al., 2000) or the Autism Diagnostic Observation Schedule, second edition (ADOS-2; Lord et al., 2012). Data from 31 children with ASD (mean age: 3.09 ± 0.91) and 33 age-matched typically developing peers (mean age: 2.92 ± 1.30) were selected for further analyses (age difference: Kolmogorov-Smirnov D = 0.27, p = 0.19; see Tables 1, 2 and Extended Data Table 2-1 for participant characteristics). Table 1 summarizes the clinical characteristics of the ASD and the TD samples. Children with ASD scored significantly lower than their TD peers across all behavioral assessments, including language reception, language expression, visual reception, and fine motor skills.

Cognitive skills measure
Developmental functioning was assessed using the Mullen Scales of Early Learning (MSEL; Mullen, 1995), which comprise five subscales, namely gross motor, visual reception, fine motor, receptive language, and expressive language. The four latter scales are so-called "cognitive scales" and are used to derive an Early Learning Composite score as a measure of overall developmental functioning. The subscales of visual reception and fine motor skills measure nonverbal ability, while receptive language and expressive language measure the ability to process linguistic input and use the language productively. The subtest raw scores were converted into age-adjusted normalized developmental quotient (DQ) scores, obtained by dividing the age-equivalent scores by the child's chronological age and multiplying the result by 100.
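As a minimal sketch of the DQ conversion described above, the following Python snippet divides an age-equivalent score by the chronological age and multiplies by 100. The function name, the use of months as the age unit, and the example child are illustrative assumptions and are not taken from the study.

```python
def developmental_quotient(age_equivalent_months: float,
                           chronological_age_months: float) -> float:
    """DQ = age-equivalent score / chronological age * 100 (same unit for both ages)."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return age_equivalent_months / chronological_age_months * 100.0


# Hypothetical child: receptive-language age equivalent of 24 months at 36 months of age.
dq = developmental_quotient(24.0, 36.0)
print(round(dq, 1))
```

A DQ of 100 indicates that the age-equivalent score matches the chronological age; lower values indicate a developmental lag on that subscale.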
The communication skills assessment, i.e., receptive language and expressive language, was complemented by the Vineland Adaptive Behavior Scales, second edition (VABS-II; Sparrow et al., 2005), a test of adaptive functioning that correlates with cognition. The VABS-II is a parent interview that assesses social, communication, motor, and daily living skills, and provides age-equivalent and standard scores for a variety of summary scales and subscales, including expressive and receptive language and social adaptive functioning. Age-equivalent scores are reported.

Stimuli and procedure
We explored cortical speech processing during a passive, naturalistic task with a relatively low cognitive demand suitable for young children. Participants watched four French Trotro (Lezoray, 2013a,b,c,d) cartoon videos, each lasting ~2.5 min. The video presentation was controlled by Tobii Studio (Tobii Technology, Sweden). The screen size was 1200 pixels (29°38′) in height and 1920 pixels (45°53′) in width, with a 60-Hz refresh rate. Participants were seated ~60 cm away from the screen. The cartoon soundtrack was delivered via loudspeakers at a sound level adjusted for each participant. As the sound level was not monitored during the experiment but only set to a comfortable level, we cannot exclude sound level differences across groups. The soundtrack sampling rate was 44.1 kHz. We removed background noise, e.g., birds' singing, and music from the original movie soundtrack using Audacity v.2.2.1 (Audacity Team, 2021) editing software (Fig. 1; Extended Data Fig. 1-1). We then extracted the stimuli envelopes using the absolute value of the analytic signal (Boashash, 2015). The speech envelope was down-sampled to 1000 Hz, and low-pass filtered using a zero-phase fourth-order Butterworth filter set at 40 Hz (Fig. 1A). An envelope spectral decomposition was performed using the fast Fourier transform (Fig. 1B). We found dominant frequencies between 1 and 7 Hz, with peaks at 1.17, 3.32, and 4.69 Hz, overlapping with the syllable rate range (four to six syllables per second; Fig. 1C) as determined by averaging peaks within the 150-ms minimum-peak-distance that were associated with the averaged French-syllable duration (Pellegrino et al., 2011). The spectrogram of the example sentence "je veux ta bougie rigolote en échange" (I want your funny candle in exchange), shown in Figure 1D, was calculated using the MATLAB function "spectrogram" (The MathWorks).

EEG acquisition and preprocessing
EEG data were acquired using a Hydrocel Geodesic Sensor Net (HCGSN, Electrical Geodesics) EEG system with 129 scalp electrodes at a 1000-Hz sampling rate. The recording reference electrode was located on the vertex (Cz), and a real-time bandpass filter at 0-100 Hz was applied to the incoming signal. The first two cartoon videos were presented in the first block, and the last two cartoon videos in a second block, with a ~5 min "dynamic image" task (Sperdin et al., 2018) in between cartoon videos and a 10-min break between blocks. At the end of the first block, the electrode impedances were measured and, if needed, the electrodes' conductance was adjusted by applying conductive gel to keep impedances below 40 kΩ.
The EEG signal preprocessing was conducted using the EEGLAB v2019 toolbox within the MATLAB environment (Delorme and Makeig, 2004) and Cartool (https://sites.google.com/site/cartoolcommunity/). First, the dataset underwent a dimensionality reduction to 110 channels by excluding cheek and neck electrodes that are often contaminated by muscle artifacts. Then, a zero-phase fourth-order Butterworth bandpass filter between 0.1 and 70 Hz was applied to the EEG signals, as well as a notch filter at 50 Hz to remove power line interference. Each participant's dataset was then visually inspected to exclude periods contaminated by movement artifacts. An independent component analysis (ICA) was computed on the dataset to identify and remove components with eye blinks, saccades, electrical line noise, and heartbeat artifacts. The dimensionality of the components was equivalent to the number of electrodes kept after bad channel detection. ICA was performed in MATLAB and Cartool. The first step was the detection of bad channels, i.e., channels with excessively large signal amplitude caused by defective electrodes. The remaining data were decomposed into independent components using the runica algorithm (EEGLAB function). Based on the ICA components' time courses and topographies, noisy components were visually identified and excluded.
Subsequently, a spherical spline interpolation was used to interpolate the channels contaminated by noise using the ICA-corrected data. Finally, the Cartool spatial filter (Michel and Brunet, 2019) was applied, and a common average reference was recalculated on the cleaned data. The spatial filter is an instantaneous filter that removes local outliers by spatially smoothing the map without losing its topographical characteristics. It was run in the following way: (1) for each electrode, the values of the six closest neighbors are determined, plus the value of the central electrode itself; (2) the seven data points are sorted; (3) the minimal and maximal values are removed by dropping the first and last items; (4) the remaining five values are averaged, with weights proportional to the inverse distance to the central electrode, which is given a weight of 1 (Michel and Brunet, 2019).
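The four steps of the spatial filter can be sketched in Python for a single electrode at a single time point. This is an illustrative sketch and not the Cartool implementation: the function name is hypothetical, the seven data points are assumed to be the central electrode plus its six nearest neighbors, and the unit in which distances are expressed (which sets neighbor weights relative to the central weight of 1) is an assumption.

```python
def spatial_filter_value(center_value, neighbor_values, neighbor_distances):
    """Smoothed value for one electrode at one time point.

    neighbor_values / neighbor_distances: six nearest neighbors (assumed),
    distances in arbitrary positive units.
    """
    if len(neighbor_values) != 6 or len(neighbor_distances) != 6:
        raise ValueError("expected exactly six neighbors")
    # (1) Collect seven (value, weight) points; the central electrode has weight 1,
    #     neighbors are weighted by the inverse of their distance to the center.
    points = [(center_value, 1.0)]
    points += [(v, 1.0 / d) for v, d in zip(neighbor_values, neighbor_distances)]
    # (2) Sort the seven points by value.
    points.sort(key=lambda p: p[0])
    # (3) Drop the minimum and the maximum (first and last items).
    kept = points[1:-1]
    # (4) Weighted average of the remaining five values.
    total_weight = sum(w for _, w in kept)
    return sum(v * w for v, w in kept) / total_weight


# Hypothetical example: one neighbor carries an outlier (100.0) that is discarded.
smoothed = spatial_filter_value(5.0, [1.0, 2.0, 3.0, 4.0, 100.0, 6.0], [1.0] * 6)
print(smoothed)
```

Because the extreme values are dropped before averaging, a single outlying electrode cannot dominate the smoothed value.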
A trial was defined based on the beginning and end of each speech chunk in the cartoon, leading to a total of 50 trials with an average duration of 1.69 s. The detected artifact periods were replaced by NaNs (not a number). After excluding artifact time segments, fewer trials remained in the ASD group than in the TD group (unpaired t test, t(62) = −2.54, p = 0.014). There were 35.12 ± 1.88 (mean ± SD, range 28-37) trials in the TD group, against 32.97 ± 4.48 (mean ± SD, range 21-36) in the ASD group. The mean trial duration was similar across groups (unpaired t(62) = 0.98, p = 0.33). The mean trial duration was 2454.4 ± 92.15 ms (mean ± SD) in the ASD group and 2431.8 ± 93.02 ms (mean ± SD) in the TD group.

Eye-tracking acquisition and analysis
To assess whether the observed effects were due to differences in the way participants explored the visual scene (gaze focus differences), we collected gaze data using a Tobii TX300 eye-tracking system (https://www.tobiipro.com) with a 300-Hz sampling rate. The cartoon frames allowed for a visual angle of 26°47′ × 45°53′ (height × width). A five-point calibration procedure consisting of child-friendly animations was performed using an inbuilt program in the Tobii system. The calibration procedure was repeated if the eye-tracking device failed to detect the participant's gaze position accurately. The lighting conditions in the testing room were constant for all acquisitions. Younger participants sat on their parent's lap to make them feel comfortable and minimize head and body movements. All participants watched all four cartoon videos in the same order. To extract fixations, we used the Tobii I-VT fixation filter (Olsen, 2012; Kojovic et al., 2022).

Gaze divergence estimation
To assess potential group differences in visual exploration, we first followed the approach of Kojovic et al. (2022) to create a gaze distribution map for each frame of the movie(s). A Gaussian kernel (adaptive bandwidth) was applied to each pair of gaze coordinates, and the results were added to obtain an estimation of gaze density (Botev et al., 2010). This step was run with the MATLAB function akde. The probability of gaze allocation at a given point of the visual scene was represented by the obtained density estimation (Fig. 2A). Next, we used the earth mover's distance (EMD; Rubner et al., 2000; Rohrbein et al., 2015; Cho et al., 2016; Orlova et al., 2016; Yilmaz, 2021), known in mathematics as the Wasserstein metric, to measure between-group differences in gaze distribution in terms of location, i.e., x-y grid, and frequency, i.e., the obtained density.
To address whether gaze patterns statistically differed between the ASD and TD groups, we used a permutation test. First, we shuffled the labels for the ASD and TD groups and recomputed the earth mover's distance (EMD), a measure of the distance between the two gaze distributions. We repeated this process 200 times to obtain an empirical null distribution for the EMD. It is important to note that two distinct types of distributions are being referred to here: the distributions of gaze for the ASD and TD groups, and the empirical null distribution of EMDs between these two distributions. To compute the one-tailed probability, we ranked the observed EMD within the empirical null distribution. A rank > 195 (top 5%) within the 200-element null distribution was considered significant. The approach was applied frame-by-frame to obtain the single-frame gaze-distribution difference (Fig. 2B), the cumulative gaze distribution over frames in the whole speech excerpt (Fig. 2C, top), and the cumulative gaze-distribution difference (Fig. 2C, bottom). Although gaze distribution did not significantly differ between groups within each frame (Fig. 2B), a significant effect was found on the cumulative gaze distribution (Fig. 2C); the observed EMD (red bar, i.e., observed EMD = 1.39 in Fig. 2C) lies outside 95% of the null distribution (Fig. 2C, blue bar). We then assessed individual gaze divergence in ASD by calculating the distance between every ASD gaze and the TD gaze norm on each frame. For comparison and interpretation, the distance was normalized to 0-1 and named Proximity Index (PI; Kojovic et al., 2022). A high PI value denotes that the visual exploration of an individual for a given frame is closer to the norm (more TD-like). A summary measure of divergence in visual exploration from the TD group was obtained by averaging PI values over all frames in each speech excerpt (Fig. 2D). Figure 2D indicates that the gaze pattern of each child with ASD diverges from that of the TD group to a different extent, as quantified by the PI. The PI values were included as covariates to reduce the gaze pattern bias in EEG signals.
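The label-shuffling logic of the permutation test can be sketched as follows. This is a deliberately simplified illustration, not the analysis used in the study: it works on one-dimensional samples of equal size, for which the earth mover's distance reduces to the mean absolute difference between sorted values, whereas the study compared two-dimensional gaze density maps. The function names, the seed, and the toy data are assumptions.

```python
import random


def emd_1d(x, y):
    """Earth mover's distance between two equal-sized 1-D samples."""
    if len(x) != len(y):
        raise ValueError("this simplified EMD needs equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(x), sorted(y))) / len(x)


def permutation_rank(group_a, group_b, n_perm=200, seed=0):
    """Return the observed EMD and its rank within a label-shuffled null distribution."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = emd_1d(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    null = []
    for _ in range(n_perm):
        rng.shuffle(pooled)  # shuffle group labels
        null.append(emd_1d(pooled[:n_a], pooled[n_a:]))
    # Rank = number of null values strictly below the observed distance.
    rank = sum(1 for d in null if d < observed)
    return observed, rank


# Toy data: two clearly separated groups of ten hypothetical gaze coordinates.
group_a = [10.0 + 0.1 * i for i in range(10)]
group_b = [0.1 * i for i in range(10)]
observed, rank = permutation_rank(group_a, group_b)
significant = rank > 195  # rank criterion as stated in the text
print(observed, rank, significant)
```

Ranking the observed distance within the null distribution is equivalent to reading off a one-tailed empirical p value.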

EEG time-frequency analysis
To identify the oscillatory responses to speech, a Morlet wavelet transformation (five cycles, Gaussian taper) was performed at each EEG electrode between 0.1 and 50 Hz, from 5 s before to 5 s after the onset of each speech chunk (making sure the epoch did not include any artifacts). The time course from 1 s preonset to 1 s postonset was then selected (based on the speech chunks with the shortest duration) on a trial-by-trial basis, and the power was averaged across trials and normalized by decibel (dB) conversion over a −1000 to 0 ms baseline period (during which an audio stimulus was presented but no speech), allowing for between-group comparisons. EEG power in the different frequency bands of interest was defined as the mean power value across 0-1000 ms postspeech stimulus onset. We compared ASD and TD groups in four frequency bands that are relevant for speech processing, i.e., δ (1-3 Hz), θ (4-8 Hz), β (12-20 Hz), and low-γ (25-35 Hz).
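The decibel baseline normalization can be illustrated with a short Python sketch. It assumes the common convention of 10·log10 of the ratio between poststimulus power and the mean baseline power; the text does not spell out the exact formula, so this convention, the function name, and the toy values are assumptions.

```python
import math


def baseline_db(power, baseline_power):
    """Convert power samples to dB relative to the mean of a baseline period."""
    baseline_mean = sum(baseline_power) / len(baseline_power)
    return [10.0 * math.log10(p / baseline_mean) for p in power]


# Toy values: trial-averaged power in one band at two poststimulus time points,
# and two samples from the -1000 to 0 ms baseline window.
post_db = baseline_db([2.0, 20.0], [2.0, 2.0])
band_power = sum(post_db) / len(post_db)  # mean over the poststimulus window
print(post_db, band_power)
```

Positive values indicate a power increase relative to baseline and negative values a decrease, which puts participants with different absolute power levels on a common scale.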

Speech envelope prediction from EEG power modulation: multiple linear regression model (MLR) with distributed lags
To probe differences between ASD and TD in oscillatory speech tracking, a multiple linear regression (MLR) model with a distributed lag between −300 and 300 ms in 50-ms steps (Fig. 3) was used to reconstruct the stimuli envelopes from the neural responses. The EEG signals were lagged to compensate for possible differences in temporal alignment between the brain response and the stimulus, a widely used method in the literature (Crosse et al., 2016; Di Liberto and Lalor, 2017; Fiedler et al., 2017; Di Liberto et al., 2018). A multiple linear regression model was trained on the resulting set using a 10-fold cross-validation approach. First, all speech chunks were divided into ten consecutive segments. In each fold, one segment was left out for testing and the remaining segments were used for training. This process was repeated 10 times, ensuring that each segment was only used once for testing. On each fold, a multiple linear model was used to find a linear combination among the brain signals that best predicted the time course of the speech envelope. The resulting model was tested by correlating the predicted envelope with the actual speech envelope in the test segment. The final result, representing the oscillation tracking index, was obtained by averaging Fisher's z-transformed correlations in each fold and then taking the inverse Fisher's transformation of the resulting mean z-score. The model was trained separately for each individual subject.
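The bookkeeping around the MLR model (lag grid, consecutive folds, and Fisher-averaged correlations) can be sketched in Python. The regression itself is omitted; the function names and the example correlations are hypothetical, and the rounding rule used to place fold boundaries is an assumption.

```python
import math

# Distributed lags from -300 to 300 ms in 50-ms steps.
lags_ms = list(range(-300, 301, 50))


def consecutive_folds(n_items, n_folds=10):
    """Split indices 0..n_items-1 into consecutive, near-equal test segments."""
    bounds = [round(i * n_items / n_folds) for i in range(n_folds + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(n_folds)]


def tracking_index(fold_correlations):
    """Average per-fold correlations in Fisher z space, then transform back."""
    z_values = [math.atanh(r) for r in fold_correlations]
    return math.tanh(sum(z_values) / len(z_values))


folds = consecutive_folds(50)  # e.g., 50 speech chunks -> 10 test segments
index = tracking_index([0.10, 0.25, 0.15, 0.30, 0.20, 0.05, 0.22, 0.18, 0.12, 0.28])
print(len(lags_ms), len(folds), round(index, 3))
```

Averaging in z space rather than averaging raw correlations avoids the bias that arises because the sampling distribution of a correlation coefficient is skewed.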
We deployed the approach separately in the two frequency bands implicated in speech tracking, i.e., θ and δ, and computed the amplitude of specific bands to obtain the oscillation tracking index. The amplitude was the absolute value of the Hilbert transform of the band-specific filtered EEG, i.e., 4-8 Hz for θ and 1-3 Hz for δ.

Phase-amplitude coupling
The time courses from electrode clusters selected from the EEG power and neural tracking analyses were examined for changes in phase-amplitude coupling (PAC). The first step was applying the Hilbert transform to each bandpass Butterworth filtered trial (Le Van Quyen et al., 2001) to obtain low-frequency phase (f_p) and high-frequency amplitude (f_a). Considering that the filters for extracting f_a must be wide enough to capture the center frequency ± the modulating f_p to detect PAC (Dvorak and Fenton, 2014; Aru et al., 2015), we decided to use a variable bandwidth, defined as center frequency ± the modulating center frequency, to improve PAC detection. The f_p bandwidth was kept narrow (center frequency ± 1 Hz) to extract sinusoidal waveforms. Furthermore, changes in the modulation power spectrum between speech and baseline periods were visually inspected in each participant to confirm that oscillation peaks/troughs were present at each modulating frequency band of interest f_p. For instance, if the modulating frequency band of interest was 4-8 Hz, we confirmed that participants presented a real peak in the power spectrum at 4-8 Hz. Next, the coupling between f_p and f_a was quantified using the Kullback-Leibler modulation index (Tort et al., 2010). The KL-MI-Tort approach estimates PAC by quantifying the amount of deviation in amplitude-phase distributions. To this end, the f_p phase is broken into 18 bins, the mean amplitude within each phase bin is calculated, and these values are normalized by the average value across all bins. The modulation index is then derived from the Kullback-Leibler distance between the observed amplitude-phase distribution (P) and a uniform amplitude-phase distribution (Q).
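A minimal Python sketch of the modulation index follows, using the standard formulation of Tort et al. (2010): the mean amplitude per phase bin is divided by the sum over bins so that P is a probability distribution, and the Kullback-Leibler distance to the uniform distribution is divided by log(N) so that the index lies between 0 and 1. The function name and the synthetic signals are assumptions; the filtering and Hilbert steps are not shown.

```python
import math


def modulation_index(phases, amplitudes, n_bins=18):
    """Kullback-Leibler modulation index for phases (radians) and amplitudes."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    width = 2.0 * math.pi / n_bins
    for phase, amp in zip(phases, amplitudes):
        wrapped = (phase + math.pi) % (2.0 * math.pi)  # map to [0, 2*pi)
        idx = min(int(wrapped / width), n_bins - 1)
        sums[idx] += amp
        counts[idx] += 1
    if 0 in counts:
        raise ValueError("every phase bin needs at least one sample")
    mean_amp = [s / c for s, c in zip(sums, counts)]
    total = sum(mean_amp)
    p = [m / total for m in mean_amp]  # observed amplitude-phase distribution P
    # KL distance between P and the uniform distribution Q (q_j = 1 / n_bins).
    kl = sum(pj * math.log(pj * n_bins) for pj in p if pj > 0)
    return kl / math.log(n_bins)


# Synthetic phases covering the full cycle evenly.
phases = [-math.pi + 2.0 * math.pi * (k + 0.5) / 360 for k in range(360)]
mi_flat = modulation_index(phases, [1.0] * 360)  # no coupling
mi_coupled = modulation_index(phases, [1.0 + math.cos(ph) for ph in phases])
print(mi_flat, mi_coupled)
```

An amplitude that is constant across phase bins yields an index of zero, while an amplitude locked to a preferred phase yields a positive index.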
Using KL-MI-Tort, we calculated PAC between phase frequencies of 2-15 Hz (in 1-Hz steps) and amplitude frequencies of 16-50 Hz (in 2-Hz steps) for the time period 0-1000 ms following speech onset and for a 1000-ms prestimulus baseline period. MI values were calculated separately for each trial and averaged to obtain a single MI value per amplitude and phase frequency. To normalize MI values, this was repeated using surrogate data, created by shuffling trial and phase-carrying information (200 surrogates).
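The comodulogram grid and the bandwidth rule described in the preceding paragraphs can be written out as a short Python sketch. The z-score used here to normalize the observed MI against the surrogates is an assumption, since the text does not specify the normalization formula; the function names and example numbers are hypothetical.

```python
import statistics

phase_freqs = list(range(2, 16))    # 2-15 Hz in 1-Hz steps
amp_freqs = list(range(16, 51, 2))  # 16-50 Hz in 2-Hz steps


def filter_bands(phase_hz, amp_hz):
    """Phase band: center +/- 1 Hz; amplitude band: center +/- modulating frequency."""
    return (phase_hz - 1, phase_hz + 1), (amp_hz - phase_hz, amp_hz + phase_hz)


def normalize_mi(observed_mi, surrogate_mis):
    """Assumed normalization: z-score of the observed MI against surrogate MIs."""
    mu = statistics.mean(surrogate_mis)
    sd = statistics.stdev(surrogate_mis)
    return (observed_mi - mu) / sd


grid = [(fp, fa) for fp in phase_freqs for fa in amp_freqs]
print(len(grid), filter_bands(6, 30))
```

With these steps, the comodulogram contains 14 phase frequencies by 18 amplitude frequencies, and each cell has its own pair of filter bands.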

Predicting clinical variables from oscillatory features
To examine the relationship between brain activity and the neurophysiological variables, we tested whether autism severity was predicted by using only band-specific power per electrode (i.e., δ, θ, β, and low-γ power), or only neural tracking values per electrode, or only phase-amplitude coupling matrices per cluster (i.e., maximum MI value and corresponding phase-frequency, amplitude-frequency). For this, we trained a Linear Discriminant Analysis (LDA) classifier using a 10-fold nested cross-validation procedure, which separates the data into test and training sets. The training set was further separated using a 5-fold cross-validation approach for parameter search (grid search). The LDA classifier was trained using a diagonal shared covariance matrix. The cross-validation process ensured that the training and testing datasets did not overlap, avoiding misleading results due to overfitting. The input to the classifier was the band-specific EEG power, neural tracking, and PAC for each participant, and the label to be predicted was symptom severity (i.e., low, moderate, and high). All empirical thresholds were obtained through the cumulative binomial distribution (Combrisson and Jerbi, 2015).
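The idea of deriving a statistical chance threshold from the cumulative binomial distribution can be sketched as follows. The sketch finds the smallest number of correct classifications whose probability under chance is at most alpha. The sample size of 31 (the number of children with ASD), the three classes, the alpha level, and the function names are used here as illustrative assumptions.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def chance_threshold(n_samples, n_classes, alpha=0.05):
    """Smallest count k (and accuracy k/n) with P(X >= k) <= alpha under chance."""
    p = 1.0 / n_classes
    for k in range(n_samples + 1):
        if binomial_tail(k, n_samples, p) <= alpha:
            return k, k / n_samples
    raise ValueError("no threshold reaches the requested alpha with this sample size")


# Assumed setting: 31 participants, three severity classes, alpha = 0.05.
k_min, acc_min = chance_threshold(31, 3)
print(k_min, round(acc_min, 3))
```

With small samples, the accuracy required for significance is well above the theoretical chance level of one over the number of classes, which is why an empirical threshold is preferable.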
Then, we probed the relationship between the neural and cognitive skills using a regularized linear model per frequency band (i.e., δ, θ, β, and low-γ power), per neural tracking per electrode, and per PAC matrix (i.e., maximum MI value and corresponding phase-frequency, amplitude-frequency) to determine which critical oscillation (or combination of oscillations) was the best predictor of cognitive skills within group, e.g., the higher the tracking value, the higher the speech reception. The linear model was based on Lasso regression, which requires finding the best hyper-parameters within high-dimensional data (Tibshirani, 1996). We also used a 10-fold nested cross-validation approach to improve model selection. The training set was further separated using a 5-fold cross-validation approach for parameter search (grid search). The results were presented as averaged R² values, which indicate the prediction power of a given feature, i.e., higher R² values signal higher prediction accuracy.

Statistical analysis
Between-group statistical comparisons of band-specific EEG power and neural tracking were done using cluster-based nonparametric permutation tests with Monte Carlo randomization (Maris and Oostenveld, 2007) using the FieldTrip toolbox (Oostenveld et al., 2011; http://fieldtriptoolbox.org) with gaze divergence as a covariate to remove any possible bias due to gaze differences between groups. In detail, between-group differences (ASD vs TD) were assessed with unpaired t-statistics. First, clusters of significant group differences were obtained by considering at least two adjacent electrodes whose t value exceeded a 5% significance threshold (ASD vs TD, unpaired t test, uncorrected for multiple comparisons). The maximum t value within each cluster was carried forward. Next, a null distribution was obtained by randomizing the gaze divergence label 1000 times and calculating the largest cluster-level t value for each permutation. The maximum t value within each original cluster was then compared against this null distribution, with values exceeding a threshold of p < 0.05 deemed significant.
In addition, between-group differences of neural tracking were tested by means of an unpaired t test for all electrodes collectively analyzed. For group comparison statistics, only one value of θ-/δ-neural tracking per participant was included in the unpaired t test comparing ASD and TD groups.

Figure 3. Schematics of the multiple linear regression (MLR) model approach. The filtered EEG signal and stimulus envelope were entered into the MLR. For the cross-validation procedure, envelopes of speech and EEG signal of 9 folds with time-lag shifting were used to fit an MLR for each participant, which was then used to predict the envelope of the 10th speech fold. The resulting model was tested by correlating the predicted envelope with the actual speech envelope in the test segment.
To assess changes in the comodulograms of PAC between the speech and baseline periods, we used nonparametric cluster-based statistics (Maris and Oostenveld, 2007). First, an uncorrected dependent-samples t test was performed (speech vs baseline), and then MI values exceeding a 5% significance threshold of the null distribution were grouped into clusters.

Speech-related oscillatory changes
To analyze EEG power and probe possible differences in neural activity in response to speech across groups (Fig. 4A), we computed the EEG power spectrum of speech compared with baseline in several frequency bands of interest, i.e., δ (1-3 Hz), θ (4-8 Hz), low-γ (25-35 Hz), and β (12-20 Hz). A between-group statistical comparison (Fig. 4B; Extended Data Fig. 4-1), with gaze divergence as a covariate, showed reduced δ, β, and low-γ band oscillatory activity (mostly on mid-central clusters) in children with ASD. In contrast, θ oscillations were comparable to those of their TD peers.

Neural tracking of speech envelope using multiple linear regression model
We then explored the differences between ASD and TD in the neural tracking of the cartoon soundtrack's phrase-level and syllable-level modulations. The full segment duration differed across participants and across groups. The mean duration was 80,935 ± 11,524 ms (mean ± SD) in the ASD group, and 85,653 ± 6712.7 ms (mean ± SD) in the TD group. The model was trained with a 10-fold cross-validation approach, meaning that the duration of each fold was around 8093.5 ms for ASD and 8565.3 ms for TD. The duration in both groups was sufficient to obtain a valid estimate. Please note that the speech content is not expected to affect low-frequency neural tracking. Therefore, although the balance across groups was not perfect, the result remains reliable.
We found that the speech envelope reconstruction was significantly less accurate in ASD participants using the θ-band signal (unpaired t(62) = 2.19, p = 0.03, η² = 0.07; Fig. 5A). This θ effect was most prominent in a specific cluster of 12 posterior-occipital electrodes (Fig. 5B). The δ-band signal from all electrodes permitted the reconstruction of the stimulus envelopes equally well in ASD and TD participants (unpaired t(62) = 0.18, p = 0.86, η² = 0.0005; Fig. 5A). However, we found reduced speech envelope reconstruction accuracy from the δ-band signal in ASD participants relative to TD participants in a specific cluster of seven parieto-occipital electrodes (Fig. 5B). These results denote that although the overall θ power was unchanged in children with ASD (Fig. 4), the neural tracking of the speech syllabic structure by δ-range and θ-range neural activity was altered.
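As a quick consistency check on the effect sizes reported above, the sketch below converts a t statistic into η² using the standard relation η² = t² / (t² + df). That the reported values were obtained this way is an assumption (the text does not say how η² was computed), but both values are reproduced after rounding.

```python
def eta_squared(t, df):
    """Effect size for a two-group comparison, derived from the t statistic."""
    return t ** 2 / (t ** 2 + df)

theta_effect = eta_squared(2.19, 62)   # θ-band tracking, ASD vs TD
delta_effect = eta_squared(0.18, 62)   # δ-band tracking, ASD vs TD
print(round(theta_effect, 2), round(delta_effect, 4))  # prints 0.07 0.0005
```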

Phase amplitude coupling
As speech is encoded hierarchically by different families of nested neural oscillations, we also analyzed phase-amplitude coupling (PAC) across modulating (<15 Hz) and modulated (16–50 Hz) frequencies using the KL-MI-Tort method. This approach was applied to those clusters showing significant between-group differences in the EEG power (all electrodes of clusters in the mid-central area; see Fig. 4) and in the neural tracking analyses (θ-tracking cluster; Fig. 4B). Before computing PAC and assessing PAC changes, we checked that there were clear power spectrum peaks and troughs at each modulating frequency band of interest (Dvorak and Fenton, 2014; Aru et al., 2015). Despite large interindividual variability in peak/trough frequencies and oscillatory power, average neural activity confirmed significant power peaks/troughs in the low-frequency range in both groups (Fig. 6A,C; Extended Data Figs. 6-1, 6-2), enabling us to compute comodulograms for speech and baseline EEG, and to compare them using cluster-based nonparametric statistics. Note that the potential effects of gaze patterns were considered in the statistical analyses and removed. The reported effects hence primarily denote auditory processes.
Although results from mid-central electrodes showed significant PAC clusters between the 3- to 8-Hz (phase) and the 22- to 38-Hz (amplitude) frequency ranges in both groups, there was only minimal overlap between ASD and TD (Fig. 6B). While TD showed a unique and strong δ-θ/low-γ PAC on mid-central (Fig. 6B) and posterior-occipital electrodes (Fig. 6D), very young children with ASD exhibited wholly different patterns. The most significant difference was the additional presence of a consistent low-β/low-γ PAC in both mid-central and posterior-occipital electrodes (Fig. 6D). In summary, compared with the TD group, children with ASD show atypical δ/θ-γ coordination, and a robust low-β/low-γ coupling anomaly.
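For readers unfamiliar with the Kullback-Leibler modulation index (MI), the following sketch computes it for one pair of phase and amplitude bands. It is a simplified illustration, not the authors' pipeline: the FFT-based band-limiting, the 18 phase bins, and the function names are assumptions. A comodulogram would repeat `tort_mi` over a grid of phase and amplitude bands.

```python
import numpy as np

def analytic_band(x, fs, lo, hi):
    """Band-limit x to [lo, hi] Hz and return its analytic signal (FFT-based sketch)."""
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(len(x), d=1 / fs)
    keep = (freqs >= lo) & (freqs <= hi)       # positive frequencies inside the band
    return np.fft.ifft(2 * spectrum * keep)

def tort_mi(x, fs, phase_band, amp_band, n_bins=18):
    """Modulation index: normalized KL divergence of the phase-binned amplitude from uniform."""
    phase = np.angle(analytic_band(x, fs, *phase_band))
    amplitude = np.abs(analytic_band(x, fs, *amp_band))
    edges = np.linspace(-np.pi, np.pi, n_bins + 1)
    bin_idx = np.clip(np.digitize(phase, edges) - 1, 0, n_bins - 1)
    mean_amp = np.array([amplitude[bin_idx == k].mean() for k in range(n_bins)])
    p = mean_amp / mean_amp.sum()              # amplitude distribution over phase bins
    kl = np.sum(p * np.log(p * n_bins))        # KL divergence from the uniform distribution
    return float(kl / np.log(n_bins))          # 0 = no coupling, 1 = maximal coupling
```

An MI near zero means that the fast-band amplitude is spread evenly over the slow-band phase, whereas larger values indicate that amplitude concentrates at particular phases.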
Prediction of clinical variables from oscillatory neural features
Among all the oscillatory features that were found atypical in children with ASD, not all of them have the potential to specifically account for individual traits and, in particular, language abilities. We, therefore, assessed whether the group differences observed at the neurophysiological level could first predict the ASD severity, and, more importantly, whether they could predict the speech reception scores obtained by the children in the MSEL (direct assessment of developmental functioning) and VABS-II (parent-reported measure of functioning in everyday life). Conversely, we also sought to find out whether some of the oscillatory anomalies detected in power, tracking, and PAC analyses could be more generally involved in several other cognitive skills (verbal production, visual processing, and fine motor skills).
In addition, low-β/low-γ PAC was a good predictor of autism severity in both clusters [accuracy 49.2% in the mid-central cluster and 51.7% in the posterior-occipital cluster (Fig. 7B); empirical chance level: 46.387%]. These results demonstrate that θ power, γ power, δ tracking, and β/γ coupling contain critical information about autism severity.
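A chance level derived from an inverse binomial distribution can be sketched as below. The function name and the example numbers are hypothetical: this section gives neither the number of severity classes nor the number of test samples, so the sketch does not attempt to reproduce the 46.387% value. It only shows why the statistical chance level of a classifier evaluated on a small sample lies above the theoretical 1/number-of-classes.

```python
from math import comb

def binomial_chance_level(n_samples, n_classes, alpha=0.05):
    """Smallest accuracy k/n whose binomial CDF under random guessing reaches 1 - alpha."""
    p = 1.0 / n_classes
    cdf = 0.0
    for k in range(n_samples + 1):
        cdf += comb(n_samples, k) * p ** k * (1 - p) ** (n_samples - k)
        if cdf >= 1 - alpha:
            return k / n_samples
    return 1.0

# Hypothetical example: 100 test samples, two balanced classes
print(binomial_chance_level(100, 2))  # prints 0.58
```

An accuracy is then considered above chance only if it exceeds this threshold rather than the nominal 50%.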

Predicting speech reception level
We then used Lasso, with a nested cross-validation approach, to determine whether individual language skills could be predicted from oscillatory features and which features accounted most selectively for individual language development status assessed by MSEL (Fig. 8) and VABS-II (Fig. 9). The analyses were run separately in each group. Although it differed between the TD and ASD groups, band-specific EEG power accounted for none of the cognitive subscales of MSEL in either group, except expressive language in TD children for the γ power and receptive language in children with ASD for the δ power (Fig. 8A). Conversely, neural tracking, which was also markedly different across groups, was predictive of three cognitive subscales of MSEL in ASD (receptive and expressive language and fine motor skills) in the δ range, and of all but receptive language in the θ range (Fig. 8B). In the TD group, neural tracking accounted for none of the MSEL scores, despite a trend for θ tracking to predict both language reception and expression. These results were also found for the language components as assessed by VABS-II (Fig. 9B). In TD, the R² values were very low for all cognitive aspects in the δ range, and for motor and visual in the θ range (Fig. 8B).
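The nested cross-validation logic can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: the Lasso is a plain coordinate-descent solver written in NumPy, and the candidate penalties `alphas`, the fold counts, and all function names are hypothetical. The key point is that the penalty is selected in an inner loop that only sees the outer training data, so the outer R² is not biased by hyperparameter tuning.

```python
import numpy as np

def lasso_fit(X, y, alpha, n_iter=200):
    """Coordinate-descent Lasso on z-scored features; returns a prediction function."""
    mu, sd = X.mean(0), X.std(0) + 1e-12
    Z = (X - mu) / sd
    n, p = Z.shape
    intercept, w = y.mean(), np.zeros(p)
    residual = y - intercept
    for _ in range(n_iter):
        for j in range(p):
            residual = residual + Z[:, j] * w[j]     # remove feature j's contribution
            rho = Z[:, j] @ residual / n
            w[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0)  # soft-thresholding
            residual = residual - Z[:, j] * w[j]
    return lambda X_new: intercept + ((X_new - mu) / sd) @ w

def kfold(n, k, rng):
    """Random partition of range(n) into k folds."""
    return np.array_split(rng.permutation(n), k)

def nested_cv_r2(X, y, alphas=(0.01, 0.1, 0.5, 1.0), outer_k=5, inner_k=4, seed=0):
    """Out-of-sample R^2 with the penalty chosen inside each outer training set."""
    rng = np.random.default_rng(seed)
    predictions = np.zeros(len(y))
    for test in kfold(len(y), outer_k, rng):
        train = np.setdiff1d(np.arange(len(y)), test)
        inner_folds = kfold(len(train), inner_k, rng)

        def inner_error(alpha):
            error = 0.0
            for val in inner_folds:
                fit = np.setdiff1d(np.arange(len(train)), val)
                model = lasso_fit(X[train[fit]], y[train[fit]], alpha)
                error += np.sum((y[train[val]] - model(X[train[val]])) ** 2)
            return error

        best_alpha = min(alphas, key=inner_error)    # tuned on training data only
        predictions[test] = lasso_fit(X[train], y[train], best_alpha)(X[test])
    return float(1 - np.sum((y - predictions) ** 2) / np.sum((y - y.mean()) ** 2))
```

Here `X` would hold one row per child and one column per oscillatory feature, and `y` the score on one developmental subscale.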
The most relevant feature for predicting language reception development was phase-amplitude coupling. As expected from studies in typical adults (Gross et al., 2013; Lizarazu et al., 2019) and from neurocomputational models (Hyafil et al., 2015a,b), θ/γ coupling selectively explained language reception in both central and posterior-occipital clusters of electrodes in our group of very young TD children (Figs. 8C, 9C, 10). In the ASD group, the (atypical) β/γ PAC selectively predicted language reception (Figs. 8C, 9C). The post hoc analysis of the dependency of language reception on β/γ PAC indicated that the stronger the anomaly, the worse the speech reception was (Figs. 8D, 9D). For display purposes, after confirming the performance of the algorithm, we used the whole dataset to run the hyperparameter optimization, and then retrained the algorithm $y = f(x)$ with the best estimator. We thereby obtained the fitted models $y = 17.45 \cdot f_p + 0.43 \cdot f_a - 5.71 \cdot \mathrm{MI} - 142.19$ (for MSEL) and $y = 0 \cdot f_p + 0 \cdot f_a - 0.62 \cdot \mathrm{MI} + 12.26$ (for VABS-II), in which $y$ represents language reception, $f_p$ and $f_a$ represent the frequency of phase and amplitude, respectively, and MI refers to the PAC value. Overall, PAC was the most specific predictor of language reception in ASD and TD children: the presence of θ/γ PAC predicts good speech reception in TD children, whereas the added presence of atypical β/γ PAC signals poor reception in ASD. Importantly, PAC features were much more sensitive than power and neural tracking in predicting individual language reception scores.
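The two fitted models can be written as small functions, which makes their structure explicit. The coefficients are taken from the text, read as y = 17.45·f_p + 0.43·f_a − 5.71·MI − 142.19 (MSEL) and y = 0·f_p + 0·f_a − 0.62·MI + 12.26 (VABS-II). The function names are hypothetical, and the units and scaling of MI are not specified in this section, so the sketch illustrates only the signs and relative roles of the terms.

```python
def msel_reception(f_p, f_a, mi):
    """MSEL language reception predicted from phase/amplitude frequency (Hz) and PAC strength."""
    return 17.45 * f_p + 0.43 * f_a - 5.71 * mi - 142.19

def vabs_reception(f_p, f_a, mi):
    """VABS-II language reception; both frequency coefficients are zero in the fitted model."""
    return 0.0 * f_p + 0.0 * f_a - 0.62 * mi + 12.26

# A one-unit increase in MI lowers the predicted score in both models
msel_drop = msel_reception(14, 30, 0.0) - msel_reception(14, 30, 1.0)
vabs_drop = vabs_reception(14, 30, 0.0) - vabs_reception(14, 30, 1.0)
```

In the VABS-II model only the coupling strength carries weight, consistent with the statement that stronger β/γ coupling goes with poorer speech reception.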

Discussion
The goal of this study was to determine whether speech-related oscillatory anomalies in ASD are already present in early childhood, around the time of ASD diagnosis. Given the established computational role of neural oscillations in chunking the syllable stream, encoding phonemic information, and predicting speech timing and linguistic content (Giraud and Poeppel, 2012; Hovsepyan et al., 2020), we also sought to determine which of several potential anomalies could most specifically be associated with speech reception difficulties in ASD. Establishing the specific relevance of oscillation anomalies with respect to speech development is critical, as early targeted neural interventions could subsequently be envisaged to normalize speech reception, as recently demonstrated in other neurodevelopmental language disorders (Ladanyi et al., 2020; Marchesotti et al., 2020). Exploring EEG in 64 children between 1.31 and 5.56 years old while balancing gaze divergence across groups, we found marked anomalies of speech-induced cortical activity in the group with ASD, including decreased expression of δ, low-γ, and β frequencies. While θ power appeared equally pronounced in ASD and TD children, δ-θ neural tracking was significantly reduced in ASD. Our most important results were observed in relation to oscillation cross-frequency coupling, which reflects the coordination of computations at different timescales. As expected from previous studies in adults (Gross et al., 2013; Hyafil et al., 2015a; Jochaut et al., 2015; Lizarazu et al., 2019), we clearly detected the classical θ/γ coupling in very young (approximately three years old) TD children and found this feature to predict their individual language reception scores specifically. This result represents per se an important finding supporting the notion that PAC is not a simple marker of speech reception ability, but reflects a key computational component of speech processing, namely the hierarchical relationship between phonemes and syllables. This typical θ/low-γ PAC was altered in children with ASD, appearing over a shifted θ/γ range and lacking occipital location once the visual exploration contribution across groups was controlled for. Critically, θ/γ coupling was accompanied by a nontypical low-β/low-γ PAC in the ASD group, which, among all the abnormal oscillatory features reported here, was the only one from which we could specifically predict language reception scores in children with ASD: the stronger the β/low-γ coupling, the worse the speech reception. This key finding suggests that the speech processing computational scales are markedly different in ASD.
We acknowledge, however, that because of larger interindividual variability in the neural sources among ASD children compared with TD controls, group-level statistics may appear weaker in ASD (Hasson et al., 2009). Source space analyses, which in our case were not possible because of the absence of structural MRI data and real digitized head points, could have led to slightly different observations. The cartoon video contained acoustic and visual features, triggering audiovisual integration. Given that modality preference can influence performance on multimodal tasks (Sumby and Pollack, 1954; Calvert et al., 1997; McDonald et al., 2000; Slutsky and Recanzone, 2001; Morein-Zamir et al., 2003; Feng et al., 2017; Lee et al., 2018; Robinson and Sloutsky, 2019), we cannot exclude that atypical speech perception in children with ASD is influenced by how individuals with ASD process visual information. Although our statistical analysis controlled for gaze patterns, differences between typical and ASD children with respect to cross-modal interactions are likely, and future research should investigate how visual information interacts with auditory neural entrainment in children with and without ASD.
Low-γ power predicts language expression in typically developing children
Our results showed decreased low-γ activity among ASD relative to TD children, notably on mid-central electrodes, a scalp location that strongly captures auditory cortex activity (Stropahl et al., 2018). Previous studies already reported reduced γ activity in ASD in response to pure tones (Rojas et al., 2008; Gandal et al., 2010), presumably denoting a basic functional anomaly of the auditory cortex. γ activity usually reflects the excitation-inhibition balance (Buzsaki and Wang, 2012) within brain circuits, which is a core parameter in neural development. Reduced low-γ power in ASD could be seen as a probe of atypical maturation of auditory neural circuitry. At the computational level, low-γ activity is associated with phonemic encoding (Lehongre et al., 2011; Hyafil et al., 2015a) and its nesting within θ rhythm is associated with the encoding of phonemic information within syllabic frames, enabling syllable-level representations to interface at the right time with other (higher) processing stages (Ghitza, 2011). Here, we found that low-γ activity was the only feature that specifically predicted language expression among TD children, a sensible finding as language expression is tightly related to the transformation of phonetics into articulatory features at the same timescale (Chartier et al., 2018).
Abnormal θ-range speech tracking and θ/γ coordination
Although globally we found a similar level of θ activity in both groups, θ-range tracking was deficient among children with ASD, meaning that θ activity, although present, did not follow the speech temporal structure in the typical way. An equivalent level of global θ power in TD and ASD is consistent with previous studies (Wang et al., 2017; Zhang et al., 2018). However, the fact that the speech envelope was less accurately reconstructed from the EEG signals among ASD compared with TD children indicates that θ activity in ASD is more weakly engaged in syllable tracking (Giraud and Poeppel, 2012; Hyafil et al., 2015a; Pefkou et al., 2017). This anomaly likely disrupts the alignment of neuronal excitability with syllabic onset, weakens the coordination of speech with oscillations in other frequency bands, notably the low-γ one, and ultimately hampers phonemic information encoding within syllables (Lehongre et al., 2011; Gross et al., 2013; Hyafil et al., 2015a; Lizarazu et al., 2019; Hovsepyan et al., 2020). Accordingly, we found anomalies of the classical speech-specific θ/γ PAC among children with ASD. This anomaly was not explained by differences between groups in the way children screened the visual scene, as gaze differences were controlled for. The coupling was shifted toward lower γ frequencies, making it incompatible with a typical role in phonemic sampling within left auditory regions, as reported in healthy individuals (Poeppel, 2003; Morillon et al., 2010). These results, however, are only in partial agreement with the more severe anomalies of θ/γ coupling, a fully inverted coupling relationship, that were previously observed using simultaneous fMRI-EEG among adults with ASD (Jochaut et al., 2015). Longitudinal studies are needed to determine whether θ/γ coupling further deteriorates during childhood development and adolescence.
Reduced δ-power but not δ-range speech tracking is a specific predictor of language reception in ASD
Our results also show reduced δ power and δ-range speech tracking among children with ASD. δ-range activity signals phrase-level chunks, which do not necessarily have a physical/acoustic counterpart in the speech stimulus (Boucher et al., 2019). δ activity is known to reflect more endogenous processes than θ-range syllable tracking (Molinaro and Lizarazu, 2018), which are argued to pertain to syntactic grouping (Ding et al., 2014; Meyer et al., 2017) or prosody processing (Ghitza, 2017; Teoh et al., 2019). The fact that reduced δ activity predicted speech reception in ASD might therefore denote altered syntactic phrasal chunking and is compatible with previous observations suggesting both weaker linguistic (Haesen et al., 2011) and intonation processing (Benitez-Burraco and Murphy, 2016). Although our results align well with these previous observations and hypotheses, δ-range speech tracking predicted not only speech reception scores but also expressive language and fine motor skills (assessed with MSEL), suggesting that the role of these oscillatory features in ASD goes beyond the sole domain of speech reception. Despite the collinearity across MSEL subscales, neural tracking also predicted language development when assessed with VABS-II. While θ tracking characterized language development in TDs, individuals with ASD showed δ tracking instead. Logically, the more endogenous δ tracking deficit has a more global impact on the cognitive profile of children with ASD and could also be a parameter that is adjustable using adapted neurostimulation methods.
β/γ cross-frequency coupling: a speech reception singularity in young children with ASD
During continuous speech perception, top-down predictive mechanisms are also important, particularly to make sense of acoustic signals that might be unclear or ambiguous, or simply to follow the speaker's speech rate. Typically, predictive mechanisms are signaled by the low-β band (Fontolan et al., 2014), which we found to be weaker among ASD compared with TD children around the mid-central region. Low-β neural activity is argued to mediate top-down information passing (Chao et al., 2018; Bastos et al., 2020), allowing for precise temporal (Fujioka et al., 2012) but also content-specific predictions (Chao et al., 2018). In speech processing, this frequency band could provide top-down integration constants that are intermediate between θ-syllabic and γ-phonemic ranges (Pefkou et al., 2017; Giraud, 2020). By alternating with bottom-up γ phases (Fontolan et al., 2014)
The key finding of this report is the atypical low-β/low-γ PAC found among children with ASD in conjunction with weaker β power. A PAC anomaly involving the low-β phase might follow from the reduced global β power in ASD, possibly allowing low-γ and γ bursts to occur more strongly within β troughs. Reduced β gating could let through neural activity that is unrelated to the predicted speech structure, possibly leading to a feeling of being overwhelmed by unformatted acoustic inputs to which no linguistic value or meaning can be attributed. The presence of an abnormal low-β/low-γ coupling in ASD was not influenced by atypical visual scene exploration in ASD, as possible gaze differences were controlled for in the statistical analysis. This interpretation aligns well with the clinically described auditory avoidance in ASD children (Haesen et al., 2011; Top et al., 2018). The other hypothesis would be the existence of a stronger β state where new sensory input is less easily incorporated, and where subjects are in a dominant "internally driven state," similar to schizophrenic symptoms (Grace, 1991). According to the communication-through-coherence theory (Fries, 2005, 2015), this remarkable β-γ PAC anomaly could be associated with the persistence of the status quo state and thus less flexibility in cognitive control (Engel and Fries, 2010; Herring et al., 2019; Wagner et al., 2019). The low-β/low-γ coupling may suggest a greater endogenous/exogenous processing ratio in ASD, particularly when exposed to speech stimuli. The observed abnormalities in neural speech processing could result from strong endogenous neural patterns, leaving subjects locked into existing "internally driven states" with a reduced capacity to shift to and integrate novel sensory information (Pogosyan et al., 2009). The difficulty in processing new and fast-changing sensory stimuli is likely associated with a reduced amount of down-propagating prediction errors.

Cross-frequency oscillation features: a potential endophenotype for targeted interventions
As cortical oscillations arise from excitatory-inhibitory interactions within and across specific cortical laminae (Cannon et al., 2014), auditory oscillation anomalies represent a plausible functional counterpart to structural disorganization and disruption of cortical inhibition previously shown in ASD (Rojas et al., 2008; Gandal et al., 2010). A recent study directly relates neural oscillations with the expression of numerous genes, several of which are involved in ASD (Berto et al., 2021), e.g., LNX1, DGKI, KCNQ5, DCX, and SHANK2. Therefore, speech reception difficulties in ASD could directly result from structural anomalies induced by mutations of genes controlling neuronal interactions, notably at the synaptic level (Caubit et al., 2016).
Disruptions of synchronous neural activity in the cortex (Dinstein et al., 2011) and other brain structures, such as the cerebellum and hippocampus (Donovan and Basson, 2017), could lead to fragmented speech processing, abstraction difficulties in the auditory modality, and difficulties in mapping atypical auditory representations into appropriately timed articulatory sequences. Here, we found that among many neural oscillation anomalies, the most promising features lie in cross-frequency coupling patterns, particularly the θ/γ coupling that has an abnormal topography in ASD and the low-β/low-γ coupling that is wholly atypical. These two anomalies could be an ideal entry point for targeted brain stimulation interventions aiming at downregulating the abnormal low-β/low-γ coupling, re-localizing the typical auditory θ/low-γ coupling to left auditory regions, and upregulating the θ/γ coupling that is classically observed in the superior temporal cortex region. The next step in this line of research will be to test whether a simple θ/γ stimulation at the exact right frequencies (5 Hz by 30 Hz) using, e.g., mild transcranial alternating current stimulation could indeed both disrupt low-β/low-γ and relocate θ/low-γ activity to auditory regions. In combination with close monitoring of behavior and EEG activity, such a trial would also allow us to firmly establish a direct link between auditory oscillatory activity and the speech reception ability in children with ASD.
In conclusion, our results confirmed the relevance of θ/low-γ coupling in speech reception in very young TD children and showed that cross-frequency patterns were markedly disrupted in children diagnosed with ASD. While θ/γ coupling closely predicted individual language reception abilities in TD children, the speech reception deficit in ASD was predicted by the unusual presence of β/γ coupling in auditory-sensitive brain regions. Cross-frequency coupling features appear as a promising language development endophenotype, bridging the gap from genetics to behavior while offering a precise entry point for interventions targeting the normalization of cross-frequency oscillatory functions. Although our study alone cannot establish causality, demonstrating the presence of oscillatory anomalies at the age of diagnosis is an important step toward reaching a causal explanation for speech reception difficulties in ASD.

Figure 1. Stimulus properties. A, Speech waveform and envelope. In black: original audio track and waveform; in blue: speech envelope. B, Power spectrum of speech envelope. The average power spectrum across all speech samples shows dominant frequencies between 1 and 7 Hz, with 1.17-, 3.32-, and 4.69-Hz peaks. C, Distribution of syllable rate across all speech samples. D, Spectrogram of a speech sample, the sentence "Je veux ta bougie rigolote en échange" (I want your funny candle in exchange). Also, see Extended Data Figure 1-1 for preprocessing pipeline.

Figure 2. Gaze divergence estimation. A, An example frame with gaze distribution in TD (delimited by black contours) and ASD (red contour) children. B, Top, Example of distribution of EMD in one frame. Bottom, EMD values in all frames. Red: observed EMD; blue: the 95% confidence interval (CI) of the null distribution. If the observed EMD is bigger than the 95% CI, the gaze distribution is significantly different across groups. C, Top, Cumulative gaze distribution (red contour: ASD, black contour: TD) over all frames and all speech excerpts. Bottom, Distribution of the permuted EMD (black histogram) and the observed EMD (red bar) of the contours in C, top. D, Mean Proximity Index (PI) values for all 31 children with ASD and all speech excerpts.

Figure 4. Speech-related oscillation changes. A, Scalp topographies of power in several frequencies of interest in ASD and TD groups. B, Comparison of frequency-power among 31 children with ASD and 33 TD peers. Children with ASD had reduced δ, β, and low-γ power and comparable θ power relative to their TD peers. Asterisks in scalp topographies indicate group differences after cluster correction (cluster-based nonparametric permutation tests, cluster-corrected p = 0.05). Also, see Extended Data Figure 4-1 for a comparison of frequency-power in both groups.

Figure 5. Neural tracking of speech envelope. Correlation coefficients between the reconstructed and real envelope, in all channels together (A) and each single channel separately (B). Bi, Correlation coefficient topographies in both groups. Bii, Correlation coefficients were significantly reduced in ASD relative to TD group in the θ band. Error bars in A represent standard error (SE) of correlation coefficients and shadows in A represent the neural tracking distribution in each group. Asterisks in B show group differences from nonparametric cluster-based permutation tests, *p < 0.05, **p < 0.01, ***p < 0.001.

Figure 6. A, C, Speech-induced oscillatory power is characterized by a strong power peak in low-frequency bands (1–10 Hz) in both groups and a marked power trough in the low-β band (10–15 Hz) in the ASD group over (A) central electrodes selected from EEG power group differences and (C) posterior-occipital electrodes selected from neural tracking group differences. B, D, Phase-amplitude comodulograms produced by statistically comparing the coupling values represented by modulation index (MI) values in the speech and baseline periods over central electrodes (B) and posterior-occipital electrodes (D). Dotted lines represent significant differences in phase-amplitude coupling. For exact cluster locations, see topographies on A, C. For a quick appraisal of $f_p$ and $f_a$ ranges in each group, see the rightmost panel ($f_p$: frequency of phase; $f_a$: frequency of amplitude; nonparametric cluster-based statistics, cluster-corrected p < 0.05). Also, see Extended Data Figures 6-1 and 6-2 for individual speech-induced oscillatory power in both groups.

Figure 7. Predicting ASD severity from EEG oscillatory activity. A, Prediction accuracy of ASD symptom severity using the EEG power and neural tracking based on all electrodes. B, Prediction accuracy of ASD symptom severity using the phase-amplitude coupling (cluster-based). The red dashed line shows the chance level determined by an inverse binomial distribution. For the exact location of clusters, see scalp topographies on top of panel B.

Figure 8. Predicted development levels in young children from EEG oscillatory activity using a regularized linear model (Lasso). A, B, Low-γ power significantly predicts language expression in TD (A); δ-/θ-tracking significantly predicts all tested cognitive components in ASD, except language reception for θ tracking, and none in TD (B). C, θ/low-γ PAC specifically predicted language reception in TD (C), whereas β/low-γ specifically predicted language reception among young children with ASD (C, left panel). R² values represent the proportion of the variance that is explained by the features for each target variable. D, Language reception prediction from β-γ coupling (r = −0.33, p = 0.04). Asterisks indicate significant R², p < 0.05; δ refers to δ, θ refers to θ, β refers to low-β, γ refers to low-γ. For the exact locations of clusters, see the scalp topographies on top of each panel in C. The missing bars indicate that R² is close to zero.
, β activity phases could stabilize representations in the face of the ever-changing acoustic input. Therefore, weaker β activity in ASD might reflect a reduced deployment of predictive mechanisms, altering both the ability to predict when acoustic signals can be expected and what content they carry. This interpretation is globally in line with alterations of phasic predictive learning in a mouse model of autism (Kosaki and Watanabe, 2016) and more generally with the hypothesis of impaired predictive coding in ASD (Courchesne and Allen, 1997; Sinha et al., 2014; Van de Cruys et al., 2014).

Figure 9. Predicted VABS-II (Vineland Adaptive Behavior Scales) development level in young children from EEG oscillatory activity using a regularized linear model (Lasso). A, B, δ and β power significantly predict language reception in TD (A), and δ tracking significantly predicts language reception in ASD (B). C, θ/low-γ PAC specifically predicts language reception in TD (C), whereas β/low-γ specifically predicts language reception among young children with ASD (C, left panel). R² values represent the part of the variance that is explained by the features for each target variable. D, Language reception prediction from β-γ coupling (r = −0.45, p = 0.01). Asterisks indicate significant R², p < 0.05; δ refers to δ, θ refers to θ, β refers to low-β, γ refers to low-γ. For the cluster locations, see corresponding scalp topographies on each panel C. The missing bars indicate that R² is close to zero.

Table 1. Participants' demographic information and group comparison of behavioral tests

Table 2. Psychometric data of every participant
the 6 closest neighbors are determined, plus the central electrode value itself;

See Extended Data Table 2-1 for psychometric data of every participant.